REMARKS

The Office Action mailed on July 13, 2006 has been carefully reviewed, and these 
remarks are responsive thereto. Claims 1-3 and 23-34 are currently pending in this application. 
By this amendment, Claims 1-3 have been cancelled. Claims 23-34 have been amended. Claims 
35-39 have been added. Thus, upon entry of the above-specified amendments, Claims 23-39 are 
presented for further examination. 

Discussion of Claim Amendments 

Applicant has amended Claims 23-34. The amendment to Claim 23 clarifies that the 
generated speech is based on received text data. The amendment is not intended to narrow the 
scope of the claim. Claims 24-34 have been amended to depend (either directly or indirectly) 
from Claim 23 instead of canceled Claim 1. New Claims 35-39 are drawn to substantially 
similar subject matter as the elected claims. These claims are fully supported in the application 
as filed. 

Discussion of Rejections Under 35 U.S.C. § 112, ¶ 2

Claim 3 is rejected under 35 U.S.C. § 112, second paragraph as being indefinite. 
Applicant has canceled Claim 3, thereby rendering this rejection moot.

Discussion of Rejections Under 35 U.S.C. § 102(b) 

Claims 1-3, 23-29, and 31-34 stand rejected under 35 U.S.C. § 102(b) as being 
anticipated by U.S. Pat. No. 5,426,460 to Erving. As noted previously, Applicant has canceled 
Claims 1-3. Therefore, the rejection of these claims is moot. 

With respect to Claim 23, the Examiner states that Erving teaches a multimedia client 
terminal (320) adapted to generate graphical image data defining a facial image which is 
animated to simulate actions accompanying speech at column 3, lines 30-37. The Examiner 
further states that Erving describes, at column 1, lines 46-63 and column 3, lines 3-37, the recited
feature of "a receiver arranged to receive signals for processing by the terminal so as to retrieve 
data therein, said signals comprising text defining the speech to be spoken and command data 
defining animations to accompany said speech." The Examiner goes on to state that the 
remaining features recited in Claim 23 may be found at column 4, line 51 through column 5, line
11. 

Claim 23, as amended, recites in relevant part (emphasis added): 



A multimedia client terminal adapted to generate graphical image 
data defining a facial image which is animated to simulate actions 
accompanying speech, the terminal comprising: 

a receiver arranged to receive signals for processing by the terminal 
so as to retrieve data therein, said retrieved data comprising text 
defining the speech to be spoken and command data defining 
animations to accompany said speech; 

animated speech generating means responsive to said retrieved 
text and command data so as to generate said speech from said text 
and to select said facial animations and use said data defining 
facial characteristics to animate said speech. 



Applicant respectfully submits that Erving fails to teach or suggest at least the recited features of 
"retrieved data comprising text defining the speech to be spoken" and "generating said speech 
from said text" as recited in Claim 23. 

In particular, Applicant submits that Erving does not disclose generating speech and
associated animation from text data, but instead describes only a system in which speech and
animation are generated based on speech signals which have passed over a communications
network. For example, at column 2, lines 20-25, Erving states:



In a particular implementation of the invention, formants are 
extracted from the vocal tract frequency responses of a particular 
vocal tract model responsive to a received voice signal over a 
telecommunication channel. The formants are associated with 
phonics using maximum likelihood techniques. The phonic 
information is then converted into lip motion in the idealized lip 
image. 

In a particular embodiment a representative still image of a caller is 
transmitted over a voice channel at the creation of a call. Selected 
facial movements of that image are continuously updated for the 
duration of the call by deriving information from a voice coded 
signal transmitting the voice message. 



As is apparent from this passage, the "speech" provided in Erving is merely the voice data 
provided in a telephone call. No text data is received, nor is text data involved in the process in
any respect. As Erving fails to teach each feature recited in the claim, Applicant respectfully
submits that the rejection of Claim 23 under Erving is improper and should be withdrawn.

Appl. No.: 10/607,672
Filed: June 27, 2003

Claim 34 also stands rejected as being anticipated by Erving. Claim 34, as amended, 
recites in relevant part (emphasis added): 



a receiver arranged to receive signals for processing by the terminal 
so as to retrieve data therein, said retrieved data comprising data 
defining the speech to be spoken and command data defining 
animations to accompany said speech, said command data being 
independent of said data defining the speech to be spoken ; 



Applicant respectfully submits that Claim 34 is allowable over Erving because Erving fails to 
teach at least the recited feature of command data being independent of data defining speech to 
be spoken. In particular, any command data that may be present in Erving is directly related to 
the data defining the speech to be spoken, i.e., the voice signals. This is illustrated, for example, 
in the passage of Erving from col. 3, lines 32-37, which states: 



The screen 301 has a special screen area 307 devoted to displaying 
the lips of the facial image of the caller calling the handset. This 
area includes a special controllable image which is controlled to 
reproduce or animate the lip motion in response to the received 
speech signal of the caller. 



As is clear from this example, in contrast to the recited claim element, the lip motion is animated 
in response to the speech signal. The command data is further shown to be related to (i.e., not
independent of) the data defining the speech to be spoken at column 4, lines 9-20: 



The input speech is also applied to the linear predictive coder 
processor 420, which provides filter coefficients which are 
representative of the resonant peaks or formants of the input 
speech. The LPC filter coefficients are related to the lip
movements producing the speech. These filter coefficients and 
indices are transmitted as the total voice speech signal to a 
telecommunications receiver. The receiver includes a phonic code 
table which equates these filter coefficients to synthesized mouth 
movements which are used to control the stylized image presented 
in the lip area of the caller as displayed at the receiver. An imaging 
system utilizes the tabular output of the phonic code table to 
activate mouth movements of the stylized mouth image. 



As shown in this passage, the LPC filter coefficients, which are transmitted with the voice signal, 
are directly based on the sounds being transmitted. As such, the command data cannot be 
independent of the data defining the speech as recited in the claim. 

Claims 24-29 and 31-34 (as well as new Claims 35-39) each depend from Claim 23
either directly or indirectly. The absence of arguments regarding the characterization of the prior art
vis-a-vis the dependent claims does not imply that Applicant agrees with these
characterizations. Applicant respectfully submits that pursuant to 35 U.S.C. § 112, ¶ 4, the
dependent claims incorporate by reference all the limitations of the claim to which they refer and
include their own patentable features, and are therefore in condition for allowance.

Discussion of Rejections Under 35 U.S.C. § 103(a)

Claim 30 stands rejected under 35 U.S.C. § 103(a) as being obvious over Erving. Claim
30 depends from Claim 23 and, in accordance with 35 U.S.C. § 112, ¶ 4, includes each of its
recited features. As discussed above, Erving fails to teach or disclose at least the features of
"retriev[ing] data comprising text defining the speech to be spoken," and "generat[ing] said
speech from said text." The Examiner proposes no modifications which would cure these
deficiencies. As a result, Applicant respectfully submits that Claim 30 is allowable.

CONCLUSION

In view of the foregoing, Applicant respectfully requests reconsideration and withdrawal 
of the outstanding rejections and, particularly, that all claims be allowed. If the Examiner finds 
any remaining impediment to the prompt allowance of these claims that could be clarified with a 
telephone conference, the Examiner is respectfully invited to call the undersigned. 

Please charge any additional fees, including any fees for additional extension of time, or 
credit overpayment to Deposit Account No. 11-1410. 



Dated: October 13, 2006